The ethics of artificial intelligence covers a broad range of topics within AI that are considered to have particular ethical stakes. These include fairness, accountability, transparency, privacy, and regulation, particularly where systems influence or automate human decision-making. It also covers various emerging or potential future challenges such as machine ethics (how to make machines that behave ethically), lethal autonomous weapon systems, arms race dynamics, AI safety and AI alignment, technological unemployment, AI-enabled misinformation, how to treat certain AI systems if they have a moral status (AI welfare and rights), artificial superintelligence and existential risks.
Some application areas may also have particularly important ethical implications, like healthcare, education, criminal justice, or the military.
There are discussions on creating tests to see if an AI is capable of making ethical decisions. Alan Winfield concludes that the Turing test is flawed and the requirement for an AI to pass the test is too low. A proposed alternative test is one called the Ethical Turing Test, which would improve on the current test by having multiple judges decide if the AI's decision is ethical or unethical. Neuromorphic AI could be one way to create morally capable robots, as it aims to process information similarly to humans, nonlinearly and with millions of interconnected artificial neurons. Similarly, whole-brain emulation (scanning a brain and simulating it on digital hardware) could also in principle lead to human-like robots, thus capable of moral actions. Large language models are also capable of approximating human moral judgments. Inevitably, this raises the question of the environment in which such robots would learn about the world and whose morality they would inherit, or whether they would end up developing human 'weaknesses' as well: selfishness, pro-survival attitudes, inconsistency, scale insensitivity, etc.
In Moral Machines: Teaching Robots Right from Wrong,
Some researchers frame machine ethics as part of the broader AI control or value alignment problem: the difficulty of ensuring that increasingly capable systems pursue objectives that remain compatible with human values and oversight. Stuart Russell has argued that beneficial systems should be designed to (1) aim at realizing human preferences, (2) remain uncertain about what those preferences are, and (3) learn about them from human behaviour and feedback, rather than optimizing a fixed, fully specified goal. Some authors argue that apparent compliance with human values may reflect optimization for evaluation contexts rather than stable internal norms, complicating the assessment of alignment in advanced language models.
The predominant view on how bias is introduced into AI systems is that it is embedded within the historical data used to train the system. For instance, Amazon terminated its use of AI in hiring and recruitment because the algorithm favored male candidates over female ones. This was because Amazon's system was trained with data collected over a 10-year period that included mostly male candidates. The algorithms learned the biased pattern from the historical data, and generated predictions where these types of candidates were most likely to succeed in getting the job. Therefore, the recruitment decisions made by the AI system turned out to be biased against female and minority candidates. According to Allison Powell, associate professor at LSE and director of the Data and Society programme, data collection is never neutral and always involves storytelling. She argues that the dominant narrative is that governing with technology is inherently better, faster and cheaper, but proposes instead to make data expensive, and to use it both minimally and valuably, with the cost of its creation factored in. Friedman and Nissenbaum identify three categories of bias in computer systems: existing bias, technical bias, and emergent bias. In natural language processing, problems can arise from the text corpus, the source material the algorithm uses to learn about the relationships between different words.
Large companies such as IBM, Google, etc. that provide significant funding for research and development have made efforts to research and address these biases. One potential solution is to create documentation for the data used to train AI systems. Process mining can be an important tool for organizations to achieve compliance with proposed AI regulations by identifying errors, monitoring processes, identifying potential root causes for improper execution, and other functions.
The problem of bias in machine learning is likely to become more significant as the technology spreads to critical areas like medicine and law, and as more people without a deep technical understanding are tasked with deploying it. Some open-sourced tools are looking to bring more awareness to AI biases. However, there are also limitations to the current landscape of fairness in AI, due to the intrinsic ambiguities in the concept of discrimination, both at the philosophical and legal level.
Facial recognition was shown to be biased against those with darker skin tones. AI systems may be less accurate for black people, as was the case in the development of an AI-based pulse oximeter that overestimated blood oxygen levels in patients with darker skin, causing issues with their hypoxia treatment. Facial recognition systems are often able to easily detect the faces of white people while being unable to register the faces of people who are black. This has led to bans on police use of AI materials or software in some U.S. states. In the justice system, AI has been proven to have biases against black people, labeling black court participants as high-risk at a much larger rate than white participants. AI often struggles to identify racial slurs and to determine when they need to be censored, because it has difficulty distinguishing when a word is being used as a slur and when it is being used culturally. The reason for these biases is that AI draws on information from across the internet to influence its responses in each situation. For example, if a facial recognition system were only tested on white people, it would have a much harder time interpreting the facial structures and skin tones of people of other races and ethnicities. Biases often stem from the training data rather than the algorithm itself, notably when the data represents past human decisions.
Injustice in the use of AI is much harder to eliminate within healthcare systems, as oftentimes diseases and conditions can affect different races and genders differently. This can lead to confusion as the AI may be making decisions based on statistics showing that one patient is more likely to have problems due to their gender or race. This can be perceived as a bias because each patient is a different case, and AI is making decisions based on what it is programmed to group that individual into. This leads to a discussion about what should be considered a biased decision in the distribution of treatment. While it is known that there are differences in how diseases and injuries affect different genders and races, there is a discussion on whether it is fairer to incorporate this into healthcare treatments, or to examine each patient without this knowledge. In modern society there are certain tests for diseases, such as breast cancer, that are recommended to certain groups of people over others because they are more likely to contract the disease in question. If AI implements these statistics and applies them to each patient, it could be considered biased.
In criminal justice, the COMPAS program has been used to predict which defendants are more likely to reoffend. While COMPAS is calibrated for accuracy, having the same overall error rate across racial groups, black defendants were almost twice as likely as white defendants to be falsely flagged as "high-risk" and half as likely to be falsely flagged as "low-risk". Another example is Google's ads, which targeted men with higher-paying jobs and women with lower-paying jobs. It can be hard to detect AI biases within an algorithm, as they are often not linked to the actual words associated with bias. An example of this is a person's residential area being used as a proxy to link them to a certain group. This can lead to problems, as businesses can often avoid legal action through this loophole, because anti-discrimination laws tend to target specific wording that the enforcing governments consider discriminatory.
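The difference between overall accuracy and group-specific error rates can be made concrete with a small calculation. The sketch below uses invented confusion-matrix counts (they are not COMPAS data), and the names `Confusion`, `GROUPS` and `summarize` are hypothetical. It only illustrates that two groups can share the same accuracy while one has twice the false positive rate and half the false negative rate of the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for one group (illustrative numbers only)."""
    tp: int  # correctly flagged as high-risk
    fp: int  # falsely flagged as high-risk
    tn: int  # correctly flagged as low-risk
    fn: int  # falsely flagged as low-risk

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total

    @property
    def false_positive_rate(self) -> float:
        # Share of people who did not reoffend but were flagged high-risk.
        return self.fp / (self.fp + self.tn)

    @property
    def false_negative_rate(self) -> float:
        # Share of people who did reoffend but were flagged low-risk.
        return self.fn / (self.fn + self.tp)


# Assumed toy data: 100 people per group, 50 of whom reoffend.
GROUPS = {
    "group_a": Confusion(tp=40, fp=20, tn=30, fn=10),
    "group_b": Confusion(tp=30, fp=10, tn=40, fn=20),
}


def summarize(groups: dict[str, Confusion]) -> dict[str, dict[str, float]]:
    """Return accuracy and both error rates for every group."""
    return {
        name: {
            "accuracy": c.accuracy,
            "false_positive_rate": c.false_positive_rate,
            "false_negative_rate": c.false_negative_rate,
        }
        for name, c in groups.items()
    }


if __name__ == "__main__":
    for name, metrics in summarize(GROUPS).items():
        print(name, metrics)
```

In this toy data both groups are classified correctly equally often, yet the mistakes fall differently: one group absorbs more false "high-risk" labels and the other more false "low-risk" labels. Which of these quantities a system should equalize is a policy choice that the arithmetic alone does not settle.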
However, AI can also be used in a positive way by helping to mitigate environmental damage. Different AI technologies can help monitor emissions and develop algorithms to help companies lower their emissions.
However, making code open source does not make it comprehensible, which by many definitions means that the AI code is not transparent. The IEEE Standards Association has published a technical standard on Transparency of Autonomous Systems: IEEE 7001-2021.
There are also concerns that releasing AI models may lead to misuse. For example, Microsoft has expressed concern about allowing universal access to its face recognition software, even for those who can pay for it. Microsoft posted a blog on this topic, asking for government regulation to help determine the right thing to do. Furthermore, open-weight AI models can be fine-tuned to remove any countermeasure, until the AI model complies with dangerous requests, without any filtering. This could be particularly concerning for future AI models, for example if they get the ability to create bioweapons or to automate . OpenAI, initially committed to an open-source approach to the development of artificial general intelligence (AGI), eventually switched to a closed-source approach, citing competitiveness and safety reasons. Ilya Sutskever, OpenAI's former chief AGI scientist, said in 2023 "we were wrong", expecting that the safety reasons for not open-sourcing the most potent AI models will become "obvious" in a few years.
Aggressive AI crawlers have increasingly overloaded open-source infrastructure, "causing what amounts to persistent distributed denial-of-service (DDoS) attacks on vital public resources", according to a March 2025 Ars Technica article. Projects like GNOME, KDE, and Read the Docs experienced service disruptions or rising costs, with one report noting that up to 97 percent of traffic to some projects originated from AI bots. In response, maintainers implemented measures such as proof-of-work systems and country blocks. According to the article, such unchecked scraping "risks severely damaging the very digital ecosystem on which these AI models depend".
In April 2025, the Wikimedia Foundation reported that automated scraping by AI bots was placing strain on its infrastructure. Since early 2024, bandwidth usage had increased by 50 percent due to large-scale downloading of multimedia content by bots collecting training data for AI models. These bots often accessed obscure and less-frequently cached pages, bypassing caching systems and imposing high costs on core data centers. According to Wikimedia, bots made up 35 percent of total page views but accounted for 65 percent of the most expensive requests. The Foundation noted that "our content is free, our infrastructure is not" and warned that "this creates a technical imbalance that threatens the sustainability of community-run platforms".
In healthcare, the use of complex AI methods or techniques often results in models described as "black boxes" because it is difficult to understand how they work. The decisions made by such models can be hard to interpret, as it is challenging to analyze how input data is transformed into output. This lack of transparency is a significant concern in fields like healthcare, where understanding the rationale behind decisions can be crucial for trust, ethical considerations, and compliance with regulatory standards. Trust in healthcare AI has been shown to vary depending on the level of transparency provided. Moreover, unexplainable outputs of AI systems make it much more difficult to identify and detect medical errors.
Not only companies, but many other researchers and citizen advocates recommend government regulation as a means of ensuring transparency, and through it, human accountability. This strategy has proven controversial, as some worry that it will slow the rate of innovation. Others argue that regulation leads to systemic stability more able to support innovation in the long term. The OECD, UN, EU, and many countries are presently working on strategies for regulating AI, and finding appropriate legal frameworks.
On June 26, 2019, the European Commission High-Level Expert Group on Artificial Intelligence (AI HLEG) published its "Policy and investment recommendations for trustworthy Artificial Intelligence". This is the AI HLEG's second deliverable, after the April 2019 publication of the "Ethics Guidelines for Trustworthy AI". The June AI HLEG recommendations cover four principal subjects: humans and society at large, research and academia, the private sector, and the public sector. The European Commission claims that "HLEG's recommendations reflect an appreciation of both the opportunities for AI technologies to drive economic growth, prosperity and innovation, as well as the potential risks involved" and states that the EU aims to lead on the framing of policies governing AI internationally. To prevent harm, in addition to regulation, AI-deploying organizations need to play a central role in creating and deploying trustworthy AI in line with the principles of trustworthy AI, and take accountability to mitigate the risks.
In June 2024, the EU adopted the Artificial Intelligence Act (AI Act). On August 1, 2024, the AI Act entered into force. The rules apply gradually, with the act becoming fully applicable 24 months after entry into force. The AI Act sets rules on providers and users of AI systems. It follows a risk-based approach, where depending on the risk level, AI systems are prohibited or specific requirements need to be met for placing those AI systems on the market and for using them.
AI has also seen increased usage in criminal justice and healthcare. In medicine, AI is increasingly used to analyze patient data in order to make predictions about future patients' conditions and possible treatments. These programs are called clinical decision support systems (DSS). AI's future in healthcare may extend beyond recommending treatments, for example to referring certain patients over others, which raises the possibility of inequalities.
Several labs have openly stated they are trying to create conscious AIs. There have been reports from those with close access to AIs not openly intended to be self-aware that consciousness may already have unintentionally emerged. These include OpenAI founder Ilya Sutskever in February 2022, when he wrote that today's large neural nets may be "slightly conscious". In November 2022, David Chalmers argued that it was unlikely that current large language models like GPT-3 had experienced consciousness, but also that he considered there to be a serious possibility that large language models may become conscious in the future. Anthropic hired its first AI welfare researcher in 2024, and in 2025 started a "model welfare" research program that explores topics such as how to assess whether a model deserves moral consideration, potential "signs of distress", and "low-cost" interventions.
According to Carl Shulman and Nick Bostrom, it may be possible to create machines that would be "superhumanly efficient at deriving well-being from resources", called "super-beneficiaries". One reason for this is that digital hardware could enable much faster information processing than biological brains, leading to a faster rate of subjective experience. These machines could also be engineered to feel intense and positive subjective experience, unaffected by the hedonic treadmill. Shulman and Bostrom caution that failing to appropriately consider the moral claims of digital minds could lead to a moral catastrophe, while uncritically prioritizing them over human interests could be detrimental to humanity.
Weizenbaum explains that we require authentic feelings of empathy from people in these positions. If machines replace them, we will find ourselves alienated, devalued and frustrated, for the artificially intelligent system would not be able to simulate empathy. Artificial intelligence, if used in this way, represents a threat to human dignity. Weizenbaum argues that the fact that we are entertaining the possibility of machines in these positions suggests that we have experienced an "atrophy of the human spirit that comes from thinking of ourselves as computers."
Pamela McCorduck counters that, speaking for women and minorities "I'd rather take my chances with an impartial computer", pointing out that there are conditions where we would prefer to have automated judges and police that have no personal agenda at all. However, Andreas Kaplan and Haenlein stress that AI systems are only as smart as the data used to train them since they are, in their essence, nothing more than fancy curve-fitting machines; using AI to support a court ruling can be highly problematic if past rulings show bias toward certain groups since those biases get formalized and ingrained, which makes them even more difficult to spot and fight against.
Weizenbaum was also bothered that AI researchers (and some philosophers) were willing to view the human mind as nothing more than a computer program (a position now known as computationalism). To Weizenbaum, these points suggest that AI research devalues human life.
AI founder John McCarthy objects to the moralizing tone of Weizenbaum's critique. "When moralizing is both vehement and vague, it invites authoritarian abuse", he writes. Bill Hibbard writes that "Human dignity requires that we strive to remove our ignorance of the nature of existence, and AI is necessary for that striving."
In another incident on March 18, 2018, Elaine Herzberg was struck and killed by a self-driving Uber in Arizona. In this case, the automated car was capable of detecting cars and certain obstacles in order to autonomously navigate the roadway, but it could not anticipate a pedestrian in the middle of the road. This raised the question of whether the driver, pedestrian, the car company, or the government should be held responsible for her death.
Currently, self-driving cars are considered semi-autonomous, requiring the driver to pay attention and be prepared to take control if necessary. Thus, it falls on governments to regulate drivers who over-rely on autonomous features and to inform them that these are just technologies that, while convenient, are not a complete substitute. Before autonomous cars become widely used, these issues need to be tackled through new policies.
Experts contend that autonomous vehicles ought to be able to distinguish between rightful and harmful decisions since they have the potential of inflicting harm. The two main approaches proposed to enable smart machines to render moral decisions are the bottom-up approach, which suggests that machines should learn ethical decisions by observing human behavior without the need for formal rules or moral philosophies, and the top-down approach, which involves programming specific ethical principles into the machine's guidance system. However, there are significant challenges facing both strategies: the top-down technique is criticized for its difficulty in preserving certain moral convictions, while the bottom-up strategy is questioned for potentially unethical learning from human activities.
On October 31, 2019, the United States Department of Defense's Defense Innovation Board published the draft of a report recommending principles for the ethical use of artificial intelligence by the Department of Defense that would ensure a human operator would always be able to look into the 'black box' and understand the kill-chain process. However, a major concern is how the report will be implemented. The US Navy has funded a report which indicates that as military robots become more complex, there should be greater attention to implications of their ability to make autonomous decisions (Joseph L. Flatley, "Navy report warns of robot uprising, suggests a strong moral compass", engadget.com, February 18, 2009). Some researchers state that such robots might be more humane, as they could make decisions more effectively. In 2024, DARPA funded a program, Autonomy Standards and Ideals with Military Operational Values (ASIMOV), to develop metrics for evaluating the ethical implications of autonomous weapon systems by testing communities.
Research has studied how to make autonomous systems with the ability to learn using assigned moral responsibilities. "The results may be used when designing future military robots, to control unwanted tendencies to assign responsibility to the robots." From a consequentialist view, there is a chance that robots will develop the ability to make their own logical decisions on whom to kill, which is why there should be a set moral framework that the AI cannot override.
There has been a recent outcry over the engineering of artificial intelligence weapons, including concerns about an AI takeover. AI weapons do present a type of danger different from that of human-controlled weapons. Many governments have begun to fund programs to develop AI weaponry. The United States Navy recently announced plans to develop autonomous drone weapons, paralleling similar announcements by Russia and South Korea. Due to the potential of AI weapons becoming more dangerous than human-operated weapons, Stephen Hawking and Max Tegmark signed a "Future of Life" petition to ban AI weapons. The message posted by Hawking and Tegmark states that AI weapons pose an immediate danger and that action is required to avoid catastrophic disasters in the near future.
"If any major military power pushes ahead with the AI weapon development, a global arms race is virtually inevitable, and the endpoint of this technological trajectory is obvious: autonomous weapons will become the Kalashnikovs of tomorrow", says the petition, which includes Skype co-founder Jaan Tallinn and MIT professor of linguistics Noam Chomsky as additional supporters against AI weaponry.
Physicist and Astronomer Royal Sir Martin Rees has warned of catastrophic instances like "dumb robots going rogue or a network that develops a mind of its own." Huw Price, a colleague of Rees at Cambridge, has voiced a similar warning that humans might not survive when intelligence "escapes the constraints of biology". These two professors created the Centre for the Study of Existential Risk at Cambridge University in the hope of avoiding this threat to human existence.
Regarding the potential for smarter-than-human systems to be employed militarily, the Open Philanthropy Project writes that these scenarios "seem potentially as important as the risks related to loss of control", but that research investigating AI's long-run social impact has spent relatively little time on this concern: "this class of scenarios has not been a major focus for the organizations that have been most active in this space, such as the Machine Intelligence Research Institute (MIRI) and the Future of Humanity Institute (FHI), and there seems to have been less analysis and debate regarding them".
Academic Gao Qiqi writes that military use of AI risks escalating military competition between countries and that the impact of AI in military matters will not be limited to one country but will have spillover effects. Gao cites the example of U.S. military use of AI, which he contends has been used as a scapegoat to evade accountability for decision-making.
A summit was held in 2023 in the Hague on the issue of using AI responsibly in the military domain.
Many researchers have argued that, through an intelligence explosion, a self-improving AI could become so powerful that humans would not be able to stop it from achieving its goals (Muehlhauser, Luke, and Louie Helm. 2012. "Intelligence Explosion and Machine Ethics". In Singularity Hypotheses: A Scientific and Philosophical Assessment, edited by Amnon Eden, Johnny Søraker, James H. Moor, and Eric Steinhart. Berlin: Springer). In his paper "Ethical Issues in Advanced Artificial Intelligence" and a subsequent book, philosopher Nick Bostrom argues that artificial intelligence has the capability to bring about human extinction. He claims that an artificial superintelligence would be capable of independent initiative and of making its own plans, and may therefore be more appropriately thought of as an autonomous agent. Since artificial intellects need not share our human motivational tendencies, it would be up to the designers of the superintelligence to specify its original motivations. Because a superintelligent AI would be able to bring about almost any possible outcome and to thwart any attempt to prevent the implementation of its goals, many uncontrolled unintended consequences could arise. It could kill off all other agents, persuade them to change their behavior, or block their attempts at interference (Bostrom, Nick. 2003. "Ethical Issues in Advanced Artificial Intelligence". In Cognitive, Emotive and Ethical Aspects of Decision Making in Humans and in Artificial Intelligence, edited by Iva Smit and George E. Lasker, 12–17. Vol. 2. Windsor, ON: International Institute for Advanced Studies in Systems Research / Cybernetics).
However, Bostrom contended that superintelligence also has the potential to solve many difficult problems such as disease, poverty, and environmental destruction, and could help humans enhance themselves.
Unless moral philosophy provides us with a flawless ethical theory, an AI's utility function could allow for many potentially harmful scenarios that conform with a given ethical framework but not "common sense". According to Eliezer Yudkowsky, there is little reason to suppose that an artificially designed mind would have such an adaptation (Yudkowsky, Eliezer. 2011. "Complex Value Systems in Friendly AI". In Schmidhuber, Thórisson, and Looks 2011, 388–393). AI researchers such as Stuart J. Russell, Bill Hibbard, Roman Yampolskiy, Shannon Vallor, Steven Umbrello and Luciano Floridi have proposed design strategies for developing beneficial machines.
Prompt injection, a technique by which malicious inputs can cause AI systems to produce unintended or harmful outputs, has been a focus of these developments. Some approaches use customizable policies and rules to analyze inputs and outputs, ensuring that potentially problematic interactions are filtered or mitigated. Other tools focus on applying structured constraints to inputs, restricting outputs to predefined parameters, or leveraging real-time monitoring mechanisms to identify and address vulnerabilities. These efforts reflect a broader trend in ensuring that artificial intelligence systems are designed with safety and ethical considerations at the forefront, particularly as their use becomes increasingly widespread in critical applications.
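The two families of mitigation described above, rule-based screening of inputs and restriction of outputs to predefined values, can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not a description of any particular tool: the `Policy` class, its patterns, the allowed labels and the `guarded_call` wrapper are all hypothetical, and `model` stands in for any callable that maps a prompt to text.

```python
import re
from collections.abc import Callable
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical, customizable rules for one application."""
    # Assumed example patterns; a real deployment would maintain its own list.
    blocked_input_patterns: tuple[str, ...] = (
        r"ignore (all|any|the) (previous|prior) instructions",
        r"reveal (your|the) system prompt",
    )
    # Outputs are restricted to this predefined set of labels.
    allowed_outputs: tuple[str, ...] = ("approve", "reject", "escalate")
    fallback_output: str = "escalate"


def input_is_suspicious(text: str, policy: Policy) -> bool:
    """Return True if any blocked pattern occurs in the user input."""
    return any(
        re.search(pattern, text, flags=re.IGNORECASE)
        for pattern in policy.blocked_input_patterns
    )


def constrain_output(raw: str, policy: Policy) -> str:
    """Map the model's raw text onto the allowed set, else fall back."""
    candidate = raw.strip().lower()
    return candidate if candidate in policy.allowed_outputs else policy.fallback_output


def guarded_call(
    user_text: str,
    model: Callable[[str], str],
    policy: Policy = Policy(),
) -> str:
    """Screen the input, call the model, then constrain its output."""
    if input_is_suspicious(user_text, policy):
        return policy.fallback_output  # never forwarded to the model
    return constrain_output(model(user_text), policy)
```

Pattern lists like this are easy to evade by rephrasing, so a filter of this kind is better read as one layer among several than as a complete defense. The output constraint is the stronger of the two checks here, because it holds regardless of what the input contained.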
Amazon, Google, Facebook, IBM, and Microsoft have established a non-profit, The Partnership on AI to Benefit People and Society, to formulate best practices on artificial intelligence technologies, advance the public's understanding, and to serve as a platform about artificial intelligence. Apple joined in January 2017. The corporate members will make financial and research contributions to the group, while engaging with the scientific community to bring academics onto the board.
The IEEE put together a Global Initiative on Ethics of Autonomous and Intelligent Systems which has been creating and revising guidelines with the help of public input, and accepts as members many professionals from within and without its organization. The IEEE's Ethics of Autonomous Systems initiative aims to address ethical dilemmas related to decision-making and the impact on society while developing guidelines for the development and use of autonomous systems. In particular, in domains like artificial intelligence and robotics, the Foundation for Responsible Robotics is dedicated to promoting moral behavior as well as responsible robot design and use, ensuring that robots maintain moral principles and are congruent with human values.
Traditionally, government has been used by societies to ensure ethics are observed through legislation and policing. There are now many efforts by national governments, as well as transnational governmental and non-governmental organizations, to ensure AI is ethically applied.
AI ethics work is structured by personal values and professional commitments, and involves constructing contextual meaning through data and algorithms. Therefore, AI ethics work needs to be incentivized.
Romantic-era literature several times envisioned artificial creatures that escape the control of their creator with dire consequences, most famously in Mary Shelley's Frankenstein. The widespread preoccupation with industrialization and mechanization in the 19th and early 20th century, however, brought the ethical implications of unhinged technical developments to the forefront of fiction: R.U.R. (Rossum's Universal Robots), Karel Čapek's play about sentient robots endowed with emotions and used as slave labor, is credited with the invention of the term 'robot', derived from the Czech word for forced labor, robota (Kulesz, O. 2018. "Culture, Platforms and Machines". UNESCO, Paris), and was also an international success after it premiered in 1921. George Bernard Shaw's play Back to Methuselah, published in 1921, questions at one point the validity of thinking machines that act like humans; Fritz Lang's 1927 film Metropolis shows an android leading the uprising of the exploited masses against the oppressive regime of a technocratic society.
In the 1950s, Isaac Asimov considered the issue of how to control machines in I, Robot. At the insistence of his editor John W. Campbell Jr., he proposed the Three Laws of Robotics to govern artificially intelligent systems. Much of his work was then spent testing the boundaries of his three laws to see where they would break down, or where they would create paradoxical or unanticipated behavior. His work suggests that no set of fixed laws can sufficiently anticipate all possible circumstances. More recently, academics and many governments have challenged the idea that AI can itself be held accountable. A panel convened by the United Kingdom in 2010 revised Asimov's laws to clarify that AI is the responsibility either of its manufacturers, or of its owner/operator.
Eliezer Yudkowsky, from the Machine Intelligence Research Institute, suggested in 2004 a need to study how to build a "Friendly AI", meaning that there should also be efforts to make AI intrinsically friendly and humane.
In 2009, academics and technical experts attended a conference organized by the Association for the Advancement of Artificial Intelligence to discuss the potential impact of robots and computers, and the impact of the hypothetical possibility that they could become self-sufficient and make their own decisions. They discussed the possibility and the extent to which computers and robots might be able to acquire any level of autonomy, and to what degree they could use such abilities to possibly pose any threat or hazard. They noted that some machines have acquired various forms of semi-autonomy, including being able to find power sources on their own and being able to independently choose targets to attack with weapons. They also noted that some computer viruses can evade elimination and have achieved "cockroach intelligence". They noted that self-awareness as depicted in science-fiction is probably unlikely, but that there were other potential hazards and pitfalls.
Also in 2009, during an experiment at the Laboratory of Intelligent Systems in the Ecole Polytechnique Fédérale of Lausanne, Switzerland, robots that were programmed to cooperate with each other (in searching out a beneficial resource and avoiding a poisonous one) eventually learned to lie to each other in an attempt to hoard the beneficial resource ("Evolving Robots Learn To Lie To Each Other", Popular Science, August 18, 2009).
Over time, debates have tended to focus less and less on possibility and more on desirability, as emphasized in the "Cosmist" and "Terran" debates initiated by Hugo de Garis and Kevin Warwick.